feat: add NanoGPT provider and missing ZAI GLM-5 models #13

Open
gr3enarr0w wants to merge 1 commit into ferro-labs:main from gr3enarr0w:feat/nanogpt-provider-zai-models

@gr3enarr0w

Summary

  • Adds NanoGPT as a new provider in the aggregator category, with 22 models spanning Anthropic, OpenAI, Google, xAI, DeepSeek, Mistral, and Moonshot AI. Pricing is set to $0.00/token because NanoGPT uses a credit-based billing model.
  • Adds three ZAI models missing from the catalog (glm-5, glm-5-turbo, and glm-5.1) to match the current Z.ai API surface.

This contribution was redirected here from ferro-labs/ai-gateway#120 per maintainer guidance (MitulShah1) that model catalog data is moving to this dedicated repo starting with release 1.2.0.

Test plan

  • Validate YAML schema passes CI (validate.yml)
  • Confirm NanoGPT provider entry renders correctly in catalog build
  • Confirm ZAI model count increases by 3

🤖 Generated with Claude Code

Add NanoGPT as a new credit-based aggregator provider with 22 models
spanning Anthropic, OpenAI, Google, xAI, DeepSeek, Mistral, and Moonshot.
Pricing is $0.00/token as NanoGPT uses a credit-based billing model.

Also add three ZAI models missing from the catalog (glm-5, glm-5-turbo,
glm-5.1) to match the current Z.ai API surface.

Relates to ferro-labs/ai-gateway#120.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@MitulShah1
Contributor

Thanks for this @gr3enarr0w, and for moving it over here from the gateway repo 👍

I ran the full pipeline on the branch — validate (0 errors), lint (0 errors), build (22 NanoGPT entries), and go test ./... all pass, so CI will be green. The ZAI GLM-5 half looks correct (prices null, clean structure). A few things on the NanoGPT side should be fixed before merge, since this catalog is the source of truth for pricing/capabilities that downstream consumers rely on.

1. Pricing should be null, not 0.0 (blocking)

Every NanoGPT model sets input_per_m_tokens: 0.0 / output_per_m_tokens: 0.0. Per the data convention (AGENTS.md): 0 = genuinely free, null = not applicable. Since NanoGPT is credit-based (as the PR notes) and doesn't publish per-token rates, these should be null — exactly like the ZAI files in this same PR.

This isn't cosmetic: cost-based routing treats $0/token as the cheapest option, so a 0.0 price would cause cost-optimizing strategies to route all matching traffic to NanoGPT.
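
To make the routing concern concrete, here is a minimal Python sketch. It is not the gateway's actual code: the `estimated_cost` and `cheapest` helpers, the flat row layout, and the two non-NanoGPT rows with their prices are all hypothetical. Only the field names `input_per_m_tokens` / `output_per_m_tokens` and the 0.0 values come from this PR. The sketch also assumes a router that skips entries without a price.

```python
from typing import Optional

# Hypothetical catalog rows; prices are USD per million tokens.
# Only the 0.0 values on the NanoGPT row mirror this PR; the other
# providers and their prices are invented for illustration.
CATALOG = [
    {"id": "provider-a/model-x", "input_per_m_tokens": 3.0, "output_per_m_tokens": 15.0},
    {"id": "provider-b/model-x", "input_per_m_tokens": 2.5, "output_per_m_tokens": 12.0},
    {"id": "nanogpt/model-x", "input_per_m_tokens": 0.0, "output_per_m_tokens": 0.0},
]


def estimated_cost(entry: dict, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return the request cost in USD, or None when the entry has no per-token price."""
    inp = entry["input_per_m_tokens"]
    out = entry["output_per_m_tokens"]
    if inp is None or out is None:
        return None  # null = not applicable, so no cost can be computed
    return (inp * input_tokens + out * output_tokens) / 1_000_000


def cheapest(catalog: list, input_tokens: int, output_tokens: int) -> Optional[str]:
    """Pick the lowest-cost entry, skipping entries without a price."""
    priced = []
    for entry in catalog:
        cost = estimated_cost(entry, input_tokens, output_tokens)
        if cost is not None:
            priced.append((cost, entry["id"]))
    return min(priced)[1] if priced else None
```

With the 0.0 prices, `cheapest(CATALOG, 1000, 500)` selects the NanoGPT row. If both price fields on that row are `None` instead, the same call falls back to the cheapest row that actually publishes a price.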

2. Capabilities & limits are templated, not the real models (blocking)

The capability blocks are identical across all 22 files, which doesn't match the underlying models. Comparing nanogpt/anthropic/claude-opus-4.7 to this repo's own base providers/anthropic/models/claude-opus-4-7.yaml:

| field | base opus-4.7 (truth) | this PR |
|---|---|---|
| context_window | 1,000,000 | 200,000 |
| max_output_tokens | 128,000 | 16,384 |
| vision | true | false |
| reasoning | true | false |
| prompt_caching | true | false |

Consumers that filter by capability (e.g. vision/reasoning support) will wrongly exclude these. The OpenRouter aggregator entries already in the catalog get this right (real per-model capabilities) — that's the bar. The cleanest path may be to mirror the base model's capabilities for each underlying model.
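
One way to approach the mirroring is sketched below in Python. The values are the ones from the comparison above; the flat dictionary layout and the helper names `capability_drift` and `mirror_capabilities` are assumptions for illustration, and the real YAML schema may nest these fields differently.

```python
# Fields compared in the review comment above.
CAPABILITY_FIELDS = (
    "context_window",
    "max_output_tokens",
    "vision",
    "reasoning",
    "prompt_caching",
)

# Values taken from the comparison in this review; the flat layout is assumed.
BASE_OPUS_4_7 = {
    "context_window": 1_000_000,
    "max_output_tokens": 128_000,
    "vision": True,
    "reasoning": True,
    "prompt_caching": True,
}
PR_NANOGPT_OPUS_4_7 = {
    "context_window": 200_000,
    "max_output_tokens": 16_384,
    "vision": False,
    "reasoning": False,
    "prompt_caching": False,
}


def capability_drift(base: dict, aggregator: dict) -> dict:
    """Map each differing field to a (base value, aggregator value) pair."""
    return {
        field: (base.get(field), aggregator.get(field))
        for field in CAPABILITY_FIELDS
        if base.get(field) != aggregator.get(field)
    }


def mirror_capabilities(base: dict, aggregator: dict) -> dict:
    """Return a copy of the aggregator entry with the base model's capabilities."""
    merged = dict(aggregator)
    merged.update({f: base[f] for f in CAPABILITY_FIELDS if f in base})
    return merged
```

Running `capability_drift` across all 22 files before and after the fix would show whether any templated values remain. Mirroring is only a starting point, since an aggregator could in principle expose different limits than the base provider.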

3. Verify the model IDs + point source: at a real endpoint (please confirm)

source: is the homepage (https://nano-gpt.com) rather than a model-listing endpoint. Some IDs are verifiable (claude-opus-4.7), but others I can't confirm exist on NanoGPT's API (gpt-5.5, gemini-3.5-flash, grok-4.20, grok-4.3, deepseek-v4-pro/flash, kimi-k2.6). Could you confirm each against NanoGPT's /v1/models response and set source: to that endpoint?
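
A check along these lines could automate the confirmation. This Python sketch assumes the /v1/models response follows the OpenAI-style list shape (`{"object": "list", "data": [{"id": ...}]}`), which is not confirmed in this thread. The `SAMPLE_RESPONSE` payload is invented for illustration and is not real NanoGPT output, and the helper names are hypothetical. It also assumes catalog IDs match the API's IDs exactly, with no provider prefix.

```python
import json
from typing import Iterable, List, Set

# Invented payload in the assumed OpenAI-style shape; not real NanoGPT output.
SAMPLE_RESPONSE = json.dumps(
    {"object": "list", "data": [{"id": "claude-opus-4.7", "object": "model"}]}
)


def listed_model_ids(models_response: str) -> Set[str]:
    """Extract the model IDs from a /v1/models-style JSON body."""
    payload = json.loads(models_response)
    return {item["id"] for item in payload.get("data", [])}


def unverified_ids(catalog_ids: Iterable[str], models_response: str) -> List[str]:
    """Return catalog IDs that do not appear in the provider's model listing."""
    listed = listed_model_ids(models_response)
    return sorted(model_id for model_id in catalog_ids if model_id not in listed)
```

Feeding the real response body into `unverified_ids` along with the 22 catalog IDs would yield the list of entries that still need confirmation or removal.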

4. Minor: display_name consistency

Mixed styles — "Gemini 3.5 Flash via NanoGPT" vs raw "claude-opus-4.7 via NanoGPT". Worth normalizing to the cased form.

Note on gateway integration (not blocking this PR)

Heads up that the catalog entry alone won't make NanoGPT routable in the AI Gateway — the gateway discovers models live from each provider's API and doesn't consume this catalog for routing. Actually calling NanoGPT would need a thin OpenAI-compatible provider in ai-gateway/providers/nanogpt/ (the openrouter.go pattern transfers cleanly). That's a separate PR; this one is fine as a data-only contribution.

The structure here is solid — it's really just the pricing-encoding and capability accuracy on the NanoGPT files. Happy to help if any of the underlying-model values are unclear.

@MitulShah1
Contributor

MitulShah1 commented May 23, 2026

@gr3enarr0w One clarification on the pricing point above, so the recommendation isn't misleading:

Switching NanoGPT to null is still the right call for the catalog (it stops asserting "genuinely free," which 0.0 does). But to be transparent: on the gateway side, null and 0.0 currently produce the same routing behavior. The cost calculator treats a nil price as 0, and the cost-optimized strategy treats "model exists in catalog" as "has a price," so a null-priced model still computes to $0 and gets picked as the cheapest route.

So null is correct here, but it doesn't by itself prevent the misroute. The real fix is gateway-side: ferro-labs/ai-gateway#126. Nothing for you to change in this PR beyond what's already noted. The catalog convention stands: null = cost unknown / not per-token billed, which is exactly right for a credit-based provider like NanoGPT.
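
The coercion described above can be illustrated with a short Python sketch. This is not the gateway's code and does not show what ferro-labs/ai-gateway#126 actually changes; `cost_nil_as_zero`, `cost_or_unknown`, and `pick_cheapest` are hypothetical names, and the guard is just one possible approach.

```python
from typing import Dict, Optional


def cost_nil_as_zero(price_per_m: Optional[float], tokens: int) -> float:
    """Mimic the described behavior: a missing price is coerced to 0."""
    return (price_per_m or 0.0) * tokens / 1_000_000


def cost_or_unknown(price_per_m: Optional[float], tokens: int) -> Optional[float]:
    """Keep 'unknown' distinct from 'free' by propagating None."""
    if price_per_m is None:
        return None
    return price_per_m * tokens / 1_000_000


def pick_cheapest(prices: Dict[str, Optional[float]], tokens: int) -> Optional[str]:
    """Choose the lowest-cost model, ignoring models whose cost is unknown."""
    known = {}
    for model, price in prices.items():
        cost = cost_or_unknown(price, tokens)
        if cost is not None:
            known[model] = cost
    return min(known, key=known.get) if known else None
```

Under the coercing calculator, a `None` price and a `0.0` price both cost $0, which is why the catalog change alone does not alter routing. With the guard, a `None`-priced model is left out of the comparison, while a genuinely free `0.0` model can still win.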
